Adaptive optimum CDR bandwidth estimation by using a Kalman gain extractor

ABSTRACT

Exemplary embodiments of the present invention relate to a clock and data recovery (CDR) apparatus with adaptive optimum CDR bandwidth estimation by using a Kalman gain extractor. The Kalman gain extractor includes an off-chip digital processor, which receives phase update information from the CDR and outputs an estimated optimum Kalman gain obtained by extracting the standard deviation of step sizes of the accumulation jitter from the power spectral density (PSD) of the phase update information, and an on-chip digital loop filter consisting of a cyclic accumulator which accumulates the phase detector's output, a gain multiplier and a phase interpolator (or DCO) controller. The off-chip digital processor includes a storage register, a fast Fourier transform (FFT) processor and an optimum Kalman gain estimator. The storage register stores the phase update information, from which the FFT processor extracts the PSD of the absolute input jitter. The optimum Kalman gain estimator calculates the optimum gain from the PSD of the accumulation jitter. The off-chip digital processor may further include a gain calibrator to compensate for the variations in the transition density.

BACKGROUND OF THE INVENTION

1. FIELD OF THE INVENTION

Exemplary embodiments of the present invention relate to a clock and data recovery (CDR) apparatus with adaptive optimum CDR bandwidth estimation by using a Kalman gain extractor.

2. Discussion of the Background

The input jitter of a clock and data recovery (CDR) can be modeled as the sum of the accumulation and periodic jitter. The periodic jitter does not accumulate over time and has bounded variance in general.

Data-dependent deterministic jitter is a subset of the periodic jitter. The accumulation jitter, on the contrary, is unbounded in nature and increases indefinitely with time, thus a CDR has to track it for bit-error-free operation.

The analogy between the Kalman filter and a bang-bang (BB) CDR is utilized for the analytical minimum bounds of the mean squared phase error of a BB CDR circuit under the condition of random phase tracking.

SUMMARY OF THE INVENTION

An exemplary embodiment of the present invention discloses a clock and data recovery (CDR) apparatus with adaptive optimum CDR bandwidth estimation by using a Kalman gain extractor comprising a clock generator configured to provide frequency locked clocks to a digitally controlled phase rotator, a bang-bang phase detector (BBPD) and a Kalman gain extractor configured to estimate an optimum Kalman gain for a loop filter connected to the BBPD and the phase rotator.

The Kalman gain extractor includes an on-chip digital loop filter and an off-chip digital processor to receive phase update information from the CDR apparatus and output the optimum Kalman gain.

The on-chip digital loop filter includes a cyclic accumulator, a gain multiplier and a phase interpolator controller.

The off-chip digital processor outputs the optimum Kalman gain obtained by extracting a standard deviation of step sizes of an accumulation jitter from power spectral density (PSD) of the phase update information.

The off-chip digital processor includes a storage register configured to store the phase update information and a fast Fourier transform (FFT) processor configured to extract PSD of an absolute input jitter from the phase update information.

The off-chip digital processor further includes an optimum Kalman gain estimator configured to calculate the optimum Kalman gain from the PSD of an accumulation jitter.

The off-chip digital processor further includes a gain calibrator configured to compensate for variations in transition density.

The Kalman gain extractor includes a Kalman filter configured to find the optimum Kalman gain by minimizing a posterior MSE recursively.

The BBPD includes a demultiplexer modeled by parallel BBPDs with a subsequent summation block.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention, and together with the description serve to explain the principles of the invention.

FIG. 1 is a view illustrating an example of Z-domain block diagram of a typical rotator-based BB CDR with a clock generator according to an exemplary embodiment of the present invention.

FIG. 2 is a view illustrating an example of discrete time model of input jitter according to an exemplary embodiment of the present invention.

FIG. 3 shows typical shape of jitter tolerance mask according to an exemplary embodiment of the present invention.

FIG. 4 is a view illustrating an example of phase-domain discrete-time model of a linearized BB CDR and input jitter according to an exemplary embodiment of the present invention.

FIG. 5 shows output MSE (Mean Squared Error) of the single BB CDR with

$\sigma_{W} = {\frac{0.6\pi}{\sqrt{2}} \times 10^{- 4}{UI}_{rms}}$

according to an exemplary embodiment of the present invention.

FIG. 6 is a view illustrating an example of the linearized discrete-time block diagram of a 1:M demultiplexed parallel BBPD (bang-bang phase detector) according to an exemplary embodiment of the present invention.

FIG. 7 shows output MSE of the demuxed BB CDR versus M, with

$\sigma_{W} = {\frac{0.6\pi}{\sqrt{2}} \times 10^{- 4}{UI}_{rms}}$

and σ_(N)=0.158 UI_(rms) according to an exemplary embodiment of the present invention.

FIG. 8 shows the analytical and simulated MSEs of a 1:8 demultiplexed BB CDR with Kalman gain according to an exemplary embodiment of the present invention.

FIG. 9 shows the simulated autocorrelation of e_(n,m) for various levels of parallelization, loop delays, and input accumulation jitters according to an exemplary embodiment of the present invention.

FIG. 10 shows the analytical and simulated MSEs with various loop gains according to an exemplary embodiment of the present invention.

FIG. 11 shows the relationship between the minimum MSE bound and loop delay D under various demultiplexing ratios according to an exemplary embodiment of the present invention.

FIG. 12 shows B_(Dn)θ_(pr) versus σ_(N) and σ_(W) ² according to an exemplary embodiment of the present invention.

FIG. 13 shows the ratio between the output MSEs and optimum gain B_(Dn)θ_(pr) versus MDσ_(W) and σ_(N) according to an exemplary embodiment of the present invention.

FIG. 14 is a view illustrating an example of the block diagram of the Kalman gain extractor according to an exemplary embodiment of the present invention.

FIG. 15 is a view illustrating an example of the block diagram of the off-chip digital processor according to an exemplary embodiment of the present invention.

DETAILED DESCRIPTION OF THE ILLUSTRATED EMBODIMENTS

The invention is described more fully hereinafter with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these exemplary embodiments are provided so that this disclosure is thorough, and will fully convey the scope of the invention to those skilled in the art. In the drawings, the size and relative sizes of layers and regions may be exaggerated for clarity. Like reference numerals in the drawings denote like elements.

Hereafter, an exemplary embodiment of the present invention will be described in detail with reference to the accompanying drawings. It is noted that the same reference numerals are used to denote the same elements throughout the drawings. In the following description of the present invention, the detailed description of known functions and configurations incorporated herein is omitted when it may make the subject matter of the present invention unclear.

FIG. 1 is a view illustrating an example of Z-domain block diagram of a typical rotator-based BB CDR (Bang-bang clock and data recovery) with a clock generator according to an exemplary embodiment of the present invention. A clock generator may provide frequency locked clocks to a digitally controlled phase rotator. Analysis in this specification may be restricted to the phase rotator loop shown in the shaded area. For an exemplary embodiment of the present invention, CDR (clock and data recovery) apparatus comprising CDR circuit to execute CDR can be modeled as the following CDR model. The CDR model may consist of a BBPD (bang-bang phase detector), a loop filter with the gain and delay of β and D, respectively, and a digitally controlled phase rotator. The gain of a phase rotator θ_(pr) may be related to its resolution, as given by the following Equation 1.

$\begin{matrix} {\theta_{pr} = \frac{1{UI}}{2^{RotatorResolution}}} & \left\lbrack {{Equation}\mspace{14mu} 1} \right\rbrack \end{matrix}$
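As a purely illustrative sketch of Equation 1, the rotator gain may be computed from the rotator resolution as follows. The function name `rotator_gain_ui` and the assumption that the resolution is expressed in bits are not part of the disclosure.

```python
def rotator_gain_ui(resolution_bits: int) -> float:
    """Return the phase step theta_pr in UI for a phase rotator (Equation 1).

    Assumption: the rotator resolution is given in bits, so one UI is
    divided into 2**resolution_bits equal phase steps.
    """
    if resolution_bits < 0:
        raise ValueError("resolution_bits must be non-negative")
    return 1.0 / (2 ** resolution_bits)


# Example: a 6-bit rotator steps the clock phase by 1/64 UI.
theta_pr = rotator_gain_ui(6)
```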

The input jitter of a CDR can be modeled as the sum of the accumulation and non-accumulative period jitter. The non-accumulative period jitter may not accumulate over time and have bounded variance in general. Data-dependent deterministic jitter may be a subset of the non-accumulative jitter. The accumulation jitter, on the contrary, may be unbounded in nature and increases indefinitely with time, thus a CDR may have to track it for bit-error-free operation.

FIG. 2 is a view illustrating an example of discrete time model of input jitter according to an exemplary embodiment of the present invention. φ_(d,n) and N_(n) may denote the accumulation and non-accumulative period jitter, respectively, at time index n. The accumulation jitter may be modeled by a discrete time random walk process. By using the Z-transform, the power spectral density of the accumulation jitter may be shown as the following Equation 2.

$\begin{matrix} {{S(f)} = {\frac{E\left\lbrack W^{2} \right\rbrack}{\left( {1 - z^{- 1}} \right)\left( {1 - z} \right)}{_{z = ^{{- {j2\pi}}\; {f/f_{Data}}}}{{= \frac{E\left\lbrack W^{2} \right\rbrack}{4{\sin^{2}\left( {f\; {\pi/f_{Data}}} \right)}}},}}}} & \left\lbrack {{Equation}\mspace{14mu} 2} \right\rbrack \end{matrix}$

where E[W²] may be the variance of random period jitter W, and f_(Data) may be the data rate. By taking the bilinear transformation of Equation 2 for simplicity, the following Equation 3 can be derived.

$\begin{matrix} {{S(f)} = \frac{{E\left\lbrack W^{2} \right\rbrack}\left( {1 + \left( {f\; {\pi/f_{Data}}} \right)^{2}} \right)}{\left( {2f\; {\pi/f_{Data}}} \right)^{2}}} & \left\lbrack {{Equation}\mspace{14mu} 3} \right\rbrack \end{matrix}$

S(f) may decrease by −20 dB/decade as frequency increases.
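The −20 dB/decade behavior can be checked numerically. The following is a minimal sketch that evaluates Equation 2 and Equation 3 and measures the change of the PSD over one decade of frequency. The function names and the numerical values in the example (a 10 Gb/s data rate and a step variance of 1e-8 UI²) are assumptions chosen only for illustration.

```python
import math


def psd_exact(f, f_data, var_w):
    """Equation 2: PSD of a discrete-time random walk with step variance var_w."""
    return var_w / (4.0 * math.sin(math.pi * f / f_data) ** 2)


def psd_bilinear(f, f_data, var_w):
    """Equation 3: the simplified form of Equation 2."""
    x = math.pi * f / f_data
    return var_w * (1.0 + x * x) / (2.0 * x) ** 2


def slope_db_per_decade(psd, f, f_data, var_w):
    """Change of the PSD, in dB, when the frequency goes from f to 10*f."""
    return 10.0 * math.log10(psd(10.0 * f, f_data, var_w) / psd(f, f_data, var_w))


# Hypothetical example values: 10 Gb/s data rate, evaluated at 1 MHz.
F_DATA = 10e9
VAR_W = 1e-8
slope_exact = slope_db_per_decade(psd_exact, 1e6, F_DATA, VAR_W)
slope_approx = slope_db_per_decade(psd_bilinear, 1e6, F_DATA, VAR_W)
```

At frequencies well below the data rate both forms are dominated by the 1/f² term, which is why the two slopes are practically identical there.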

A jitter tolerance mask may provide the information on the accumulation and random non-accumulative period jitter of a serial link. FIG. 3 shows typical shape of jitter tolerance mask according to an exemplary embodiment of the present invention. The accumulation jitter may dominate at low frequencies and decrease by −20 dB/decade as frequency increases. In a SONET (Synchronous optical network) jitter tolerance mask, the magnitude of random non-accumulative period jitter may intersect with the accumulation jitter at 1/2500th of the data rate.

The magnitude of S(f) can be estimated with the jitter tolerance mask since it may represent the maximum permissible jitter present in a communication link. Even if the practical jitter in a link is hardly composed of sinusoids, the jitter tolerance specification may be defined with sinusoids for testing purposes. In practice, the jitter in serial links carrying real traffic may be more like random noise.

Appropriate values for σ_(W) and σ_(N) can be estimated by matching the variances of the modeled jitter in FIG. 2 with that of a sinusoid defined in the jitter tolerance mask. Let the magnitude of the jitter tolerance mask be j(f), and let W and N be white Gaussian processes. |S(f)| may have to satisfy |S(f)|=|j(f)|²/8. For a SONET jitter mask, σ_(W) and σ_(N) may be

$\frac{0.6\pi}{\sqrt{2}} \times 10^{- 4}{UI}_{rms}$

and 0.053 UI_(rms), respectively, and σ_(N)>>σ_(W).

FIG. 4 is a view illustrating an example of phase-domain discrete-time model of a linearized BB CDR and input jitter according to an exemplary embodiment of the present invention. A nonlinear BBPD may be linearized by using a Markov chain analogy in phase lock. The linearized BBPD may consist of linear gain K_(bbpd) with quantization noise φ_(bbpd). The equivalent gain K_(bbpd) may be given by the following Equation 4.

$\begin{matrix} {K_{bbpd} = {\frac{1}{\sqrt{2\pi}\sigma_{J}}\left\lbrack {1 + e^{{- \frac{1}{2}}{(\frac{\beta \; \theta_{pr}}{\sigma_{J}})}^{2}}} \right\rbrack}} & \left\lbrack {{Equation}\mspace{14mu} 4} \right\rbrack \end{matrix}$

where σ_(j) may be the standard deviation of the relative input Gaussian jitter φ_(j)=φ_(in)−φ_(out). φ_(bbpd) can be modeled by a white random process uncorrelated with φ_(j) if σ_(j)>>βθ_(pr). The standard deviation of φ_(bbpd) may be approximately 0.750σ_(j). In case σ_(j)≦0.5βθ_(pr), the dynamics of a BB CDR may be merely nonlinear.
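A minimal sketch of Equation 4 is given below. The function name `bbpd_gain` and the example values are assumptions for illustration. The sketch also shows that, when σ_(j)>>βθ_(pr), the exponential term approaches 1 and the gain approaches 2/(√(2π)σ_(j)).

```python
import math


def bbpd_gain(sigma_j, beta, theta_pr):
    """Equivalent linear gain K_bbpd of a bang-bang phase detector (Equation 4).

    sigma_j  : standard deviation of the relative input jitter, in UI
    beta     : loop filter gain
    theta_pr : phase rotator gain, in UI
    """
    ratio = beta * theta_pr / sigma_j
    return (1.0 + math.exp(-0.5 * ratio * ratio)) / (math.sqrt(2.0 * math.pi) * sigma_j)


# Hypothetical example: sigma_j = 0.1 UI, beta = 1, 6-bit rotator (1/64 UI).
k_bbpd = bbpd_gain(0.1, 1.0, 1.0 / 64)
# Large-jitter limit of Equation 4 for comparison.
k_limit = 2.0 / (math.sqrt(2.0 * math.pi) * 0.1)
```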

n-th prediction error e_(n) may be expressed as the following Equation 5.

e _(n)=φ_(d,n)−φ_(out,n)   [Equation 5]

where φ_(d,n) and φ_(out,n) may be the n-th desired and the output clock phases, respectively. If computational latency D is neglected, for simplicity, the n+1-th prediction error e_(n+1) may be recursively given by the following Equation 6.

e _(n+1)=φ_(d,n+1)−φ_(out,n+1)=(1−K _(bbpd)βθ_(pr))e _(n) +W _(n) −K _(bbpd)βθ_(pr)(N _(n)+φ_(bbpd,n))   [Equation 6]

The MSE (Mean Squared Error) of the n+1-th prediction error may be expressed as the following Equation 7.

$\begin{matrix} {{E\left\lbrack e_{n + 1}^{2} \right\rbrack} = {{\left( {1 - {K_{bbpd}{\beta\theta}_{pr}}} \right)^{2}{E\left\lbrack e_{n}^{2} \right\rbrack}} + {E\left\lbrack W_{n}^{2} \right\rbrack} + {K_{bbpd}^{2}\beta^{2}{\theta_{pr}^{2}\left( {{E\left\lbrack N_{n}^{2} \right\rbrack} + {\frac{9}{16}\sigma_{J}^{2}}} \right)}}}} & \left\lbrack {{Equation}\mspace{14mu} 7} \right\rbrack \end{matrix}$

where E[φ_(bbpd,n) ²]≈9σ_(j) ²/16 under phase lock. Provided that the CDR bandwidth may be sufficiently large to track the accumulation jitter, σ_(j) ² may be approximately E[W²]+E[N²]. By setting E[e_(n+1) ²]=E[e_(n) ²]=E[e_(∞) ²], the steady state MSE may be given by the following Equation 8.

$\begin{matrix} {{{E\left\lbrack e_{\infty}^{2} \right\rbrack} = \frac{{\left( {1 + {\frac{9}{16}K_{bbpd}^{2}\beta^{2}\theta_{pr}^{2}}} \right){E\left\lbrack W^{2} \right\rbrack}} + {\frac{25}{16}K_{bbpd}^{2}\beta^{2}\theta_{pr}^{2}{E\left\lbrack N^{2} \right\rbrack}}}{{2K_{bbpd}\beta \; \theta_{pr}} - {K_{bbpd}^{2}\beta^{2}\theta_{pr}^{2}}}}\mspace{20mu} {where}\mspace{20mu} {{E\left\lbrack W_{n}^{2} \right\rbrack} = {{{E\left\lbrack W^{2} \right\rbrack}\mspace{14mu} {and}\mspace{14mu} {E\left\lbrack N_{n}^{2} \right\rbrack}} = {{E\left\lbrack N^{2} \right\rbrack}.}}}} & \left\lbrack {{Equation}\mspace{14mu} 8} \right\rbrack \end{matrix}$

FIG. 5 shows the analytical and simulated MSEs of a BB CDR with

$\sigma_{W} = {\frac{0.6\pi}{\sqrt{2}} \times 10^{- 4}{{UI}_{rms}.}}$

The gain of the loop filter may be set to β=1. The behavioral simulation results may validate the theoretical analysis in the meaningful σ_(N) range.

High-speed digital domain CDRs typically may make parallel demultiplexed subrate phase updates due to timing constraints of digital logic blocks.

FIG. 6 is a view illustrating an example of the linearized discrete-time block diagram of a 1:M demultiplexed parallel BBPD according to an exemplary embodiment of the present invention. A demultiplexer may be modeled by parallel BBPDs with a subsequent summation block.

e_(n,m) may be the n-th prediction error of the m-th channel in the set of parallel BBPDs as given by e_(n,m)=φ_(d,n,m)−φ_(out,n,m). The time and channel indices may satisfy −∞<n<∞ and 0<m≦M, respectively, where M may be the level of parallelization. The linearized gain of the m-th BBPD, K_(bbpd,m), may be 2/√(2π(mσ_(W)²+σ_(N)²)), since the random jitter W may be accumulated for m cycles. In the case of σ_(W)<<σ_(N), this linearized gain may become insensitive to the channel index m and can be approximated as K_(bbpd,m)≈2/(√(2π)σ_(N))=K_(bbpd). A recursive equation for the (n+1)-th prediction error of the first channel, e_(n+1,1), may be given by the following Equation 9.

e _(n+1,1) =e _(n,M) +W _(n,1)−(Me _(n,1)+Σ_(k=2) ^(M)(M+1−k)W _(n−1,k)+Σ_(k=1) ^(M)(N _(n,k)+φ_(bbpd,n,k)))K _(bbpd)βθ_(pr)   [Equation 9]

e_(n,m) may be related to e_(n,1) by the following Equation 10 since the phase updates occur every M-th input signal.

e _(n,m) =e _(n,1)+Σ_(k=2) ^(m) W _(n−1,k)   [Equation 10]

The following Equation 11 may be derived by substituting Equation 10 into Equation 9,

e _(n+1,1) =e _(n,1)+Σ_(k=2) ^(M) W _(n−1,k) +W _(n,1)−(Me _(n,1)+Σ_(k=2) ^(M)(M+1−k)W _(n−1,k)+Σ_(k=1) ^(M)(N _(n,k)+φ_(bbpd,n,k)))K _(bbpd)βθ_(pr).   [Equation 11]

The MSE of the first channel is given by the following Equation 12.

$\begin{matrix} {{{E\left\lbrack e_{{n + 1},1}^{2} \right\rbrack} = {{\left( {1 - {{MK}_{bbpd}\beta \; \theta_{pr}}} \right)^{2}{E\left\lbrack e_{n,1}^{2} \right\rbrack}} + {\sum\limits_{k = 1}^{M - 1}{\left( {1 - {\left( {M - k} \right)K_{bbpd}\beta \; \theta_{pr}}} \right)^{2}{E\left\lbrack W_{{n - 1},{k + 1}}^{2} \right\rbrack}}} + {\frac{9}{16}\frac{{M\left( {M + 1} \right)}\left( {{2M} + 1} \right)}{6}\beta^{2}K_{bbpd}^{2}\theta_{pr}^{2}{E\left\lbrack W^{2} \right\rbrack}} + {E\left\lbrack W^{2} \right\rbrack} + {\frac{25}{16}M\; \beta^{2}K_{bbbpd}^{2}\theta_{pr}^{2}{E\left\lbrack N^{2} \right\rbrack}}}}\mspace{20mu} {where}\mspace{20mu} {{E\left\lbrack \left( {\sum\limits_{k = 1}^{M}\varphi_{{bbpd},n,k}} \right)^{2} \right\rbrack} = {\left( {9/16} \right){\sum\limits_{k = 1}^{M}\left( {\sigma_{N}^{2} + {k^{2}\sigma_{W}^{2}}} \right)}}}} & \left\lbrack {{Equation}\mspace{14mu} 12} \right\rbrack \end{matrix}$

phase lock. By defining the MSE at n+1-th clock cycle as the average MSE among M parallel channels, the following Equation 13 can be obtained.

$\begin{matrix} {{E\left\lbrack e_{n + 1}^{2} \right\rbrack} = {\frac{\sum\limits_{k = 1}^{M}{E\left\lbrack e_{{n + 1},k}^{2} \right\rbrack}}{M} = {{E\left\lbrack e_{{n + 1},1}^{2} \right\rbrack} + {\frac{M - 1}{2}{E\left\lbrack W^{2} \right\rbrack}}}}} & \left\lbrack {{Equation}\mspace{14mu} 13} \right\rbrack \end{matrix}$

By substituting Equation 12 into Equation 13, the following Equation 14 can be derived.

$\begin{matrix} {{{{E\left\lbrack e_{n + 1}^{2} \right\rbrack} = {{\left( {1 - {{MK}_{bbpd}\beta \; \theta_{pr}}} \right)^{2}{E\left\lbrack e_{n}^{2} \right\rbrack}} + {\left( {M + {\begin{pmatrix} {{\frac{9}{16}\frac{{M\left( {M + 1} \right)}\left( {{2M} + 1} \right)}{6}} -} \\ \frac{{M\left( {M - 1} \right)}\left( {M + 1} \right)}{6} \end{pmatrix}\beta^{2}K_{bbpd}^{2}\theta_{pr}^{2}}} \right){E\left\lbrack W^{2} \right\rbrack}} + {\frac{25}{16}M\; \beta^{2}K_{bbpd}^{2}\theta_{pr}^{2}{E\left\lbrack N^{2} \right\rbrack}}}},\mspace{20mu} {{{where}\mspace{14mu} {E\left\lbrack W_{{n - 1},k}^{2} \right\rbrack}} = {{E\left\lbrack W_{n,k}^{2} \right\rbrack} = {{E\left\lbrack W^{2} \right\rbrack}\mspace{14mu} {and}}}}}\mspace{20mu} {{E\left\lbrack N_{{n - 1},k}^{2} \right\rbrack} = {{E\left\lbrack N_{n,k}^{2} \right\rbrack} = {{E\left\lbrack N^{2} \right\rbrack}.}}}} & \left\lbrack {{Equation}\mspace{14mu} 14} \right\rbrack \end{matrix}$

The steady state MSE may be given by the following Equation 15.

$\begin{matrix} {{{E\left\lbrack e_{\infty}^{2} \right\rbrack} = \frac{{{ME}\left\lbrack W^{2} \right\rbrack} + {M\; \beta^{2}K_{bbpd}^{2}\theta_{pr}^{2}\eta}}{{2{MK}_{bbpd}\beta \mspace{2mu} \theta_{pr}} - {M^{2}K_{bbpd}^{2}\beta^{2}\theta_{pr}^{2}}}}{{{where}\mspace{14mu} \eta} = {{\lambda \; {E\left\lbrack W_{n}^{2} \right\rbrack}} + {\frac{25}{16}{E\left\lbrack N_{n}^{2} \right\rbrack}\mspace{14mu} {and}}}}{\lambda = {{\frac{9}{16}\frac{\left( {M + 1} \right)\left( {{2M} + 1} \right)}{6}} - {\frac{\left( {M - 1} \right)\left( {M + 1} \right)}{6}.}}}} & \left\lbrack {{Equation}\mspace{14mu} 15} \right\rbrack \end{matrix}$

FIG. 7 shows output MSE of the demuxed BB CDR versus M, with

$\sigma_{W} = {\frac{0.6\pi}{\sqrt{2}} \times 10^{- 4}{UI}_{rms}}$

and σ_(N)=0.158 UI_(rms) according to an exemplary embodiment of the present invention. The MSE may increase in proportion to M since the phase update latency degrades the tracking performance.

The Kalman filter may be a discrete time minimum MSE estimator that finds the optimum Kalman gain by minimizing the posterior MSE recursively. The tracking error in a BB CDR can be minimized by incorporating the Kalman filter algorithm in selecting the optimum forward gain β. The optimum Kalman gain may achieve the optimum balance between tracking the accumulation jitter and filtering the non-accumulative period jitter.

B_(n) may be β at time index n. By taking the derivative of E[e_(n+1) ²] in Equation 14 with respect to B_(n), the following Equation 16 may be derived.

$\begin{matrix} {\frac{{E\left\lbrack e_{n + 1}^{2} \right\rbrack}}{B_{n}} = {{{- 2}\left( {1 - {{MK}_{bbpd}B_{n}\theta_{pr}}} \right){MK}_{bbpd}\theta_{pr}{E\left\lbrack e_{n}^{2} \right\rbrack}} + {\left( {{\frac{9}{16}\frac{{M\left( {M + 1} \right)}\left( {{2M} + 1} \right)}{3}} - \frac{{M\left( {M - 1} \right)}\left( {M + 1} \right)}{3}} \right)B_{n}K_{bbpd}^{2}\theta_{pr}^{2}{E\left\lbrack W^{2} \right\rbrack}} + {\frac{25}{8}{MB}_{n}K_{bbpd}^{2}\theta_{pr}^{2}{E\left\lbrack N^{2} \right\rbrack}}}} & \left\lbrack {{Equation}\mspace{14mu} 16} \right\rbrack \end{matrix}$

Optimum Kalman gain B_(n) satisfying dE[e_(n+1) ²]/dB_(n)=0 may be expressed as the following Equation 17.

$\begin{matrix} {B_{n} = {\frac{1}{K_{bbpd}\theta_{pr}}\frac{E\left\lbrack e_{n}^{2} \right\rbrack}{{{ME}\left\lbrack e_{n}^{2} \right\rbrack} + {\lambda \; {E\left\lbrack W_{n}^{2} \right\rbrack}} + {\frac{25}{16}{E\left\lbrack N_{n}^{2} \right\rbrack}}}}} & \left\lbrack {{Equation}\mspace{14mu} 17} \right\rbrack \end{matrix}$

By substituting Equation 17 into Equation 14 for simplicity, the following Equation 18 may be derived.

E[e _(n+1) ²]=(1−MB _(n) K _(bbpd)θ_(pr))E[e _(n) ²]+ME[W _(n) ²]   [Equation 18]

Equation 17 and Equation 18 may yield the recursive procedure that constitutes the Kalman filtering algorithm. The steady state MSE may be expressed as the following Equation 19.

$\begin{matrix} {{E\left\lbrack e_{\infty}^{2} \right\rbrack} = \frac{{{ME}\left\lbrack W^{2} \right\rbrack} + \sqrt{{M^{2}{E\left\lbrack W^{2} \right\rbrack}^{2}} + {4\eta \; {E\left\lbrack W^{2} \right\rbrack}}}}{2}} & \left\lbrack {{Equation}\mspace{14mu} 19} \right\rbrack \\ {where} & \; \\ {{E\left\lbrack W_{n}^{2} \right\rbrack} = {E\left\lbrack W^{2} \right\rbrack}} & \; \\ {and} & \; \\ {{E\left\lbrack N_{n}^{2} \right\rbrack} = {{E\left\lbrack N^{2} \right\rbrack}.}} & \; \end{matrix}$

Equation 19 may indicate the minimum MSE bound of a BB CDR.
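The recursive procedure formed by Equation 17 and Equation 18 can be sketched as follows, and its fixed point can be compared with the bound of Equation 19. The gain is handled in the normalized form g = B_(n)K_(bbpd)θ_(pr). The function names, the starting MSE and the numerical values are assumptions for illustration only.

```python
import math


def lam(m):
    """lambda of Equation 15."""
    return (9.0 / 16.0) * (m + 1) * (2 * m + 1) / 6.0 - (m - 1) * (m + 1) / 6.0


def kalman_gain_norm(mse, m, var_w, var_n):
    """Normalized Kalman gain B_n * K_bbpd * theta_pr (Equation 17)."""
    return mse / (m * mse + lam(m) * var_w + (25.0 / 16.0) * var_n)


def kalman_step(mse, m, var_w, var_n):
    """MSE update with the gain of Equation 17 applied (Equation 18)."""
    g = kalman_gain_norm(mse, m, var_w, var_n)
    return (1.0 - m * g) * mse + m * var_w


def min_mse_bound(m, var_w, var_n):
    """Minimum steady state MSE of a BB CDR with zero loop delay (Equation 19)."""
    eta = lam(m) * var_w + (25.0 / 16.0) * var_n
    return (m * var_w + math.sqrt((m * var_w) ** 2 + 4.0 * eta * var_w)) / 2.0


# Hypothetical example values (variances in UI^2), 1:8 demultiplexing.
VAR_W = (0.6 * math.pi / math.sqrt(2.0) * 1e-4) ** 2
VAR_N = 0.158 ** 2
M = 8

mse = 1.0  # arbitrary large starting error
for _ in range(5000):
    mse = kalman_step(mse, M, VAR_W, VAR_N)
```

Starting from a large error, the gain is initially close to 1/M and then shrinks as the MSE settles toward the bound.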

FIG. 8 shows the analytical and simulated MSEs of a 1:8 demultiplexed BB CDR with Kalman gain according to an exemplary embodiment of the present invention. In this case, FIG. 8 shows the analytical and simulated MSEs of a 1:8 demultiplexed BB CDR under various gains with

${\sigma_{W} = {\frac{0.6\pi}{\sqrt{2}} \times 10^{- 4}{UI}_{rms}}},$

and D=0. The theoretical and simulated results may show close agreement, and the MSEs may be minimized when the Kalman gains are applied.

As described above, implementation non-idealities such as latency in the loop filter and quantization noise from the phase rotator may be neglected for simplicity in the analysis. Control latency, however, may degrade the tracking performance of a CDR by decreasing the closed loop phase margin. Digitally controlled phase rotators may have limited resolution for the output phase. Reduced resolution may relax the complexity of a rotator while degrading the jitter performance of a CDR.

In case delay in the loop filter D may be nonzero, Equation 11 may be modified as the following Equation 20.

e _(n+1,1) =e _(n,1)+Σ_(k=2) ^(M) W _(n−1,k) +W _(n,1)−(Me _(n−D,1)+Σ_(k=2) ^(M)(M+1−k)W _(n−D−1,k)+Σ_(k=1) ^(M)(N _(n−D,k)+φ_(bbpd,n−D,k)))K _(bbpd) B _(n)θ_(pr).   [Equation 20]

According to Equation 20, K_(bbpd) can be approximated as K_(bbpd)=2q₀/(√(2π)σ_(j)) if σ_(j)<B_(n)/K_(bbpd), where q₀ is ½, ⅓, and ⅕ for D=0, 1 and 2, respectively. However, in the case of σ_(j)>B_(n)/K_(bbpd), K_(bbpd)=2/(√(2π)σ_(j)) and may be independent of the loop delay.

In order to calculate the MSE under nonzero loop delay, the correlation between e_(n,1) and e_(n−D,1) may be considered.

FIG. 9 shows the simulated autocorrelation of e_(n,m) for various levels of parallelization, loop delays, and input accumulation jitters according to an exemplary embodiment of the present invention when gain B_(n), defined in Equation 17 is used.

In this case, FIG. 9 shows the simulated autocorrelation of e_(n,m) with σ_(N)=0.158 UI_(rms). The autocorrelation R_(e_(n,m))(k) may be given by E[err(Mn+m)err(Mn+m+k)], where err(Mn+m)=e_(n,m). It may clearly demonstrate that e_(n,m) can be approximated as a white process except in the vicinity of the origin. Small E[e_(n,1)e_(n−D,1)] may make B_(n) remain close to the optimum value under nonzero loop delay. Close examination of FIG. 9 may reveal that the slope of the autocorrelation near the origin is close to E[W²], irrespective of D and M. By using this observation result, the expectation value of e_(n,1)e_(n−D,1) can be approximated as the following Equation 21.

E[e _(n,1) e _(n−D,1)]≈E[e _(n,1) ²]−MDE[W _(n) ²]   [Equation 21]

By using Equation 21, E[(e_(n,1)−MB_(Dn)K_(bbpd)θ_(pr)e_(n−D,1))²] may become the following Equation 22.

E[(e _(n,1) −MB _(Dn) K _(bbpd)θ_(pr) e _(n−D,1))²]=(1−MB _(Dn) K _(bbpd)θ_(pr))² E[e _(n,1) ²]+2M ² DB _(Dn) K _(bbpd)θ_(pr) E[W _(n) ²].   [Equation 22]

From Equation 20 and Equation 22, the recursive MSE equation with nonzero D may be expressed as the following Equation 23.

$\begin{matrix} {{{E\left\lbrack e_{n + 1}^{2} \right\rbrack} = {{\left( {1 - {{MK}_{bbpd}B_{Dn}\theta_{pr}}} \right)^{2}{E\left\lbrack e_{n}^{2} \right\rbrack}} + {\begin{pmatrix} {M + \left( {{\frac{9}{16}\frac{{M\left( {M + 1} \right)}\left( {{2M} + 1} \right)}{6}} - \frac{{M\left( {M - 1} \right)}\left( {M + 1} \right)}{6}} \right)} \\ {B_{Dn}^{2}K_{bbpd}^{2}\theta_{pr}^{2}} \end{pmatrix}{E\left\lbrack W^{2} \right\rbrack}} + {\frac{25}{16}{MB}_{Dn}^{2}K_{bbpd}^{2}\theta_{pr}^{2}{E\left\lbrack N^{2} \right\rbrack}} + {2M^{2}{DB}_{Dn}K_{bbpd}\theta_{pr}{E\left\lbrack W^{2} \right\rbrack}}}},} & \left\lbrack {{Equation}\mspace{14mu} 23} \right\rbrack \end{matrix}$

where B_(Dn) may denote the Kalman gain with loop delay. By taking a similar approach to Equation 16, Kalman gain B_(Dn) may be expressed as the following Equation 24.

$\begin{matrix} {B_{Dn} = {\frac{1}{K_{bbpd}\theta_{pr}}\frac{{E\left\lbrack e_{n}^{2} \right\rbrack} - {{MDE}\left\lbrack W_{n}^{2} \right\rbrack}}{{{ME}\left\lbrack e_{n}^{2} \right\rbrack} + {\lambda \; {E\left\lbrack W_{n}^{2} \right\rbrack}} + {\frac{25}{16}{E\left\lbrack N_{n}^{2} \right\rbrack}}}}} & \left\lbrack {{Equation}\mspace{14mu} 24} \right\rbrack \end{matrix}$

The Kalman gain under control latency may be smaller than that of Equation 17, because only the low frequency prediction error may be valid. In most cases, the tracking error may satisfy E[e_(n) ²]>>E[W²] in the locked condition, and then B_(Dn)≈B_(n). By substituting Equation 24 into Equation 23, the following Equation 25 may be derived.

E[e _(n+1) ²]=(1−MB _(Dn) K _(bbpd)θ_(pr))E[e _(n) ²]+(M+M ² DB _(Dn) K _(bbpd)θ_(pr))E[W _(n) ²]  [Equation 25]

and the steady state MSE may be the following Equation 26.

$\begin{matrix} {{E\left\lbrack e_{\infty}^{2} \right\rbrack} = {\frac{\left( {{2D} + 1} \right){{ME}\left\lbrack W^{2} \right\rbrack}}{2} + \frac{\sqrt{{{M^{2}\left( {{4D} + 1} \right)}{E\left\lbrack W^{2} \right\rbrack}^{2}} + {4\eta \; {E\left\lbrack W^{2} \right\rbrack}}}}{2}}} & \left\lbrack {{Equation}\mspace{14mu} 26} \right\rbrack \\ {\mspace{79mu} {where}} & \; \\ {\mspace{79mu} {{E\left\lbrack W_{n}^{2} \right\rbrack} = {E\left\lbrack W^{2} \right\rbrack}}} & \; \\ {\mspace{79mu} {and}} & \; \\ {\mspace{79mu} {{E\left\lbrack N_{n}^{2} \right\rbrack} = {{E\left\lbrack N^{2} \right\rbrack}.}}} & \; \end{matrix}$

Equation 26 may represent the generalized minimum MSE bound of a BB CDR. This bound may be equal to Equation 19, when D=0.
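The delayed recursion can be sketched in the same manner. The sketch below uses the gain that follows from setting the derivative of Equation 23 with respect to B_(Dn) to zero, namely a numerator E[e_(n)²]−MDE[W²] over a denominator ME[e_(n)²]+λE[W²]+(25/16)E[N²], together with the update of Equation 25, and compares the result with Equation 26. The gain is normalized as g = B_(Dn)K_(bbpd)θ_(pr). Function names and numerical values are assumptions for illustration only.

```python
import math


def lam(m):
    """lambda of Equation 15."""
    return (9.0 / 16.0) * (m + 1) * (2 * m + 1) / 6.0 - (m - 1) * (m + 1) / 6.0


def delayed_gain_norm(mse, m, d, var_w, var_n):
    """Normalized Kalman gain B_Dn * K_bbpd * theta_pr with loop delay d."""
    return (mse - m * d * var_w) / (m * mse + lam(m) * var_w + (25.0 / 16.0) * var_n)


def delayed_step(mse, m, d, var_w, var_n):
    """MSE update with the delayed Kalman gain applied (Equation 25)."""
    g = delayed_gain_norm(mse, m, d, var_w, var_n)
    return (1.0 - m * g) * mse + (m + m * m * d * g) * var_w


def delayed_bound(m, d, var_w, var_n):
    """Generalized minimum MSE bound (Equation 26)."""
    eta = lam(m) * var_w + (25.0 / 16.0) * var_n
    root = math.sqrt(m * m * (4 * d + 1) * var_w ** 2 + 4.0 * eta * var_w)
    return ((2 * d + 1) * m * var_w + root) / 2.0


# Hypothetical example values (variances in UI^2), 1:8 demultiplexing, D = 2.
VAR_W = (0.6 * math.pi / math.sqrt(2.0) * 1e-4) ** 2
VAR_N = 0.158 ** 2
M, D = 8, 2

mse = 1e-3  # arbitrary starting error
for _ in range(20000):
    mse = delayed_step(mse, M, D, VAR_W, VAR_N)
```

With D=0 the functions above reduce to the zero-delay recursion, so the same code can be used to compare bounds across loop delays.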

FIG. 10 shows the analytical and simulated MSEs with various loop gains according to an exemplary embodiment of the present invention. In this case, FIG. 10 shows the analytical and simulated MSEs of a 1:8 demultiplexed BB CDR with Kalman gain with σ_(W)=4π×10⁻⁴ UI_(rms), D=2. The analytical results may closely match the simulated results, and it may be clear that the MSE is at a minimum when the optimum gain B_(Dn) is employed.

FIG. 11 shows the relationship between the minimum MSE bound and loop delay D under various demultiplexing ratios according to an exemplary embodiment of the present invention. In this case, FIG. 11 shows the relationship between the minimum MSE bound and loop delay D when

$\sigma_{W} = {\frac{0.6\pi}{\sqrt{2}} \times 10^{- 4}{UI}_{rms}}$

and σ_(N)=0.158 UI_(rms). The minimum MSE bound may increase in proportion to D and M.

FIG. 12 shows B_(Dn)θ_(pr) versus σ_(N) and σ_(W) ² according to an exemplary embodiment of the present invention. In this case, FIG. 12 shows the optimum value of B_(Dn)θ_(pr) for the minimum MSE with respect to σ_(N) and σ_(W) ² in steady state. B_(Dn)θ_(pr) may be inversely proportional to D and M and proportional to the variances of the non-accumulative period and accumulation jitter.

By substituting Equation 26 into Equation 24, the optimum forward gain B_(Dn)θ_(pr) may be given by the following Equation 27.

$\begin{matrix} {{B_{Dn}\theta_{pr}} = {\frac{\sqrt{2\pi}\sigma_{N}}{2}{\left( {{M\; \sigma_{W}^{2}} + \sqrt{{{M^{2}\left( {{4D} + 1} \right)}\sigma_{W}^{4}} + {4\eta \; \sigma_{W}^{2}}}} \right)/\begin{pmatrix} {{\left( {{2D} + 1} \right)M^{2}\sigma_{W}^{2}} +} \\ {{M\sqrt{{{M^{2}\left( {{4D} + 1} \right)}\sigma_{W}^{4}} + {4\eta \; \sigma_{W}^{2}}}} + {2{\lambda\sigma}_{W}^{2}} + {\frac{25}{8}\sigma_{N}^{2}}} \end{pmatrix}}}} & \left\lbrack {{Equation}\mspace{14mu} 27} \right\rbrack \end{matrix}$

In the case of σ_(W)<<σ_(N), √(M²(4D+1)σ_(W) ⁴+4ησ_(W) ²)≈(5/2)σ_(W)σ_(N), and hence, Equation 27 can be simplified to Equation 28.

$\begin{matrix} {{B_{Dn}\theta_{pr}} \approx \frac{{M\; \sigma_{W}^{2}} + {\frac{5}{2}\sigma_{W}\sigma_{N}}}{{2M\; \sigma_{W}} + {\frac{5}{2}\sigma_{N}}}} & \left\lbrack {{Equation}\mspace{14mu} 28} \right\rbrack \end{matrix}$

By using a Taylor series, Equation 28 can be further simplified as given by the following Equation 29 and Equation 30.

$\begin{matrix} {{B_{Dn}\theta_{pr}} \approx {\left( {1 - \frac{2M\; \sigma_{W}}{5\sigma_{N}}} \right)\sigma_{W}}} & \left\lbrack {{Equation}\mspace{14mu} 29} \right\rbrack \\ {\approx \sigma_{W}} & \left\lbrack {{Equation}\mspace{14mu} 30} \right\rbrack \end{matrix}$

Because a PLL is designed to track the accumulation jitter, the forward gain, which represents the bandwidth of a PLL, may have to be mainly related to the accumulation jitter; the optimum bandwidth may be approximately the standard deviation of the step size of the accumulation jitter.
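The closeness of the approximations can be illustrated numerically. The following sketch evaluates Equation 27 and the simplified forms of Equation 29 and Equation 30 for one operating point. The function names and the chosen operating point are assumptions for illustration only.

```python
import math


def lam(m):
    """lambda of Equation 15."""
    return (9.0 / 16.0) * (m + 1) * (2 * m + 1) / 6.0 - (m - 1) * (m + 1) / 6.0


def optimum_gain(sigma_w, sigma_n, m, d):
    """Optimum forward gain B_Dn * theta_pr in UI (Equation 27)."""
    w2 = sigma_w ** 2
    n2 = sigma_n ** 2
    eta = lam(m) * w2 + (25.0 / 16.0) * n2
    root = math.sqrt(m * m * (4 * d + 1) * w2 * w2 + 4.0 * eta * w2)
    num = m * w2 + root
    den = (2 * d + 1) * m * m * w2 + m * root + 2.0 * lam(m) * w2 + (25.0 / 8.0) * n2
    return math.sqrt(2.0 * math.pi) * sigma_n / 2.0 * num / den


def gain_eq29(sigma_w, sigma_n, m):
    """First-order approximation of the optimum gain (Equation 29)."""
    return (1.0 - 2.0 * m * sigma_w / (5.0 * sigma_n)) * sigma_w


def gain_eq30(sigma_w):
    """Zeroth-order approximation: the optimum gain is about sigma_W (Equation 30)."""
    return sigma_w


# Hypothetical operating point: SONET-like sigma_W, sigma_N = 0.158 UI rms, M = 8, D = 0.
SIGMA_W = 0.6 * math.pi / math.sqrt(2.0) * 1e-4
SIGMA_N = 0.158
exact = optimum_gain(SIGMA_W, SIGMA_N, 8, 0)
approx29 = gain_eq29(SIGMA_W, SIGMA_N, 8)
approx30 = gain_eq30(SIGMA_W)
```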

FIG. 13 shows the ratio between the output MSEs and optimum gain B_(Dn)θ_(pr) versus MDσ_(W) and σ_(N) according to an exemplary embodiment of the present invention. In this case, FIG. 13 shows the ratio between the output MSEs simulated with Equation 29, Equation 30, and the optimum B_(Dn)θ_(pr). The MSE using Equation 29 may be greater than the minimum bound by 1% for σ_(N)>0.02 UI_(rms) and MDσ_(W)<0.02 UI_(rms). The MSE using Equation 30 may deviate even further from the minimum bound, but the difference may still be less than 4% for σ_(N)>0.02 UI_(rms) and MDσ_(W)<0.02 UI_(rms).

FIG. 14 is a view illustrating an example of the block diagram of the Kalman gain extractor according to an exemplary embodiment of the present invention. It may include an off-chip digital processor and an on-chip digital loop filter. The off-chip digital processor may receive phase update information from the CDR and output an estimated optimum Kalman gain obtained by extracting σ_(W) from the power spectral density (PSD) of the phase update information. Here, σ_(W) may be the standard deviation of the step sizes of the accumulation jitter. The on-chip digital loop filter may consist of a cyclic accumulator which accumulates the phase detector's output, a gain multiplier and a phase interpolator (or DCO) controller.

When the recovered clock phase is locked to the input data, the operation of the clock path of a CDR may be similar to delta modulation. The staircase approximation of the input jitter, φ_(q,n), may be given by the following Equation 31.

φ_(q,n)=φ_(q,n−1)+Σ_(m=1) ^(M)βθ_(pr) e _(n,m)   [Equation 31]

Assuming that the accumulation process starts at zero time, Equation 31 can be approximated as the following Equation 32.

φ_(q,n)=Σ_(i=1) ^(n)Σ_(m=1) ^(M)βθ_(pr) e _(i,m)   [Equation 32]

Therefore, the accumulator output Σ_(i=1) ^(n)Σ_(m=1) ^(M)βe_(i,m) can be considered as φ_(q,n)/θ_(pr).
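
The accumulation in Equation 31 and Equation 32 can be sketched in a few lines. This is an illustrative model only: the function name `staircase_phase`, the array layout (one row per update cycle, one column per parallel phase detector output) and the use of NumPy are assumptions, not part of the described hardware.

```python
import numpy as np


def staircase_phase(e, beta, theta_pr):
    """Return (accumulator output, staircase phase phi_q) per Equation 32.

    e        : array of shape (n_cycles, M) with BBPD decisions e_{i,m}
    beta     : loop gain applied to each decision
    theta_pr : phase step of the phase rotator (scale factor to phase domain)
    """
    e = np.asarray(e, dtype=float)
    # Inner sum over the M parallel decisions of each cycle, then running sum
    # over cycles, assuming the accumulation starts from zero.
    accumulator = np.cumsum(beta * e.sum(axis=1))
    # The accumulator output equals phi_q / theta_pr, so scale it back.
    phi_q = theta_pr * accumulator
    return accumulator, phi_q


if __name__ == "__main__":
    decisions = [[1, 1], [-1, 1], [1, 0]]  # hypothetical BBPD outputs, M = 2
    acc, phi = staircase_phase(decisions, beta=0.5, theta_pr=0.25)
    print(acc, phi)
```

The first returned array corresponds to the accumulator output, and multiplying it by θ_(pr) recovers the staircase approximation φ_(q,n).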

FIG. 15 is a view illustrating an example of the block diagram of the off-chip digital processor according to an exemplary embodiment of the present invention. In this case, FIG. 15 shows the computational procedure of the optimum bandwidth estimator implemented by using an off-chip digital processor. The off-chip digital processor may include a storage register, a fast Fourier transform (FFT) processor and an optimum Kalman gain estimator. The storage register may store the phase update information, from which the FFT processor extracts the PSD of the absolute input jitter. The optimum Kalman gain estimator may calculate the optimum gain from the PSD of the accumulation jitter. The off-chip digital processor may further include a gain calibrator to compensate for the variations in the transition density.

The input of the off-chip digital processor may be the phase update information from the cyclic accumulator. The input signal may be stored in a storage register and then multiplied by a scale factor θ_(pr) for the phase domain conversion. The UI-domain PSD of the accumulated phase noise, S(f), can be obtained by using a fast Fourier transform (FFT) algorithm. The standard deviation σ_(W) can be estimated from S(f) since σ_(W)=√(4S(f)sin²(πf/f_(Data))). Finally, a calibrator may multiply σ_(W) by an architecture-dependent correction factor that accounts for the resolution of the phase rotator, the data transition density and the gain reduction caused by undersampling.
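
The estimation of σ_(W) from S(f) can be sketched as below. This is a minimal sketch under stated assumptions, not the processor's actual implementation: the phase samples are assumed to be already in UI and taken once per data period, S(f) is taken as the periodogram |FFT|²/N (so that white noise of variance v gives S=v), and a straight line through the first and last samples is removed before the FFT to suppress spectral leakage. The detrending step, the bin averaging and the name `estimate_sigma_w` are additions for illustration and do not appear in the description. The calibration factor is omitted.

```python
import numpy as np


def estimate_sigma_w(phase_ui, k_lo, k_hi):
    """Estimate sigma_W from accumulated phase samples via the periodogram.

    phase_ui : accumulated phase in UI, one sample per data period (assumed)
    k_lo, k_hi : FFT bin range [k_lo, k_hi) used for the estimate
    """
    phase_ui = np.asarray(phase_ui, dtype=float)
    n = len(phase_ui)
    # Assumption: remove the line through the end points so the record is
    # continuous when extended periodically (reduces leakage of the random walk).
    detrended = phase_ui - np.linspace(phase_ui[0], phase_ui[-1], n)
    psd = np.abs(np.fft.rfft(detrended)) ** 2 / n  # S(f), per-sample scaling
    k = np.arange(k_lo, k_hi)
    # With one sample per data period, f / f_Data = k / n.
    sigma_w_sq = 4.0 * psd[k] * np.sin(np.pi * k / n) ** 2
    # Average over the selected bins to reduce the variance of the estimate.
    return float(np.sqrt(sigma_w_sq.mean()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    steps = rng.normal(0.0, 0.01, 2**16)  # hypothetical sigma_W = 0.01 UI
    print(estimate_sigma_w(np.cumsum(steps), 10, 2**15))
```

In this synthetic example the input is a pure random walk, so nearly the whole spectrum can be used. With non-accumulating jitter present, the bin range would have to be restricted to a valid frequency range.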

Proper selection of f may be crucial because σ_(W) can be misinterpreted due to computational error at low frequencies and period jitter at high frequencies. The upper 3 dB corner frequency may be determined by the ratio between σ_(W) and the standard deviation σ_(N) of the period jitter, as given by

$f_{c} = {\frac{f_{Data}\sigma_{W}}{2{\pi\sigma}_{N}}.}$

The PSD retrieved from the FFT may be valid for frequencies greater than 1/(N_(FFT)T_(S)), where N_(FFT) and T_(S) are the number of points in the FFT and the sampling period, respectively. In order to eliminate the low frequency computational error caused by the limited data storage, a frequency in excess of 10/(N_(FFT)T_(S)) may have to be chosen. Therefore, provided that N_(FFT)>>10/(f_(c)T_(S)), the valid frequency range may be

$\frac{10}{N_{FFT}T_{S}} < f < {f_{c}.}$

The exemplary embodiments according to the present invention may be recorded in computer-readable media including program instructions to implement various operations embodied by a computer. The media may also include, alone or in combination with the program instructions, data files, data structures, and the like. The media and program instructions may be those specially designed and constructed for the purposes of the present invention, or they may be of the kind well-known and available to those having skill in the computer software arts.

It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention cover the modifications and variations of this invention provided they come within the scope of the appended claims and their equivalents.

What is claimed is:
 1. A clock and data recovery (CDR) apparatus with adaptive optimum CDR bandwidth estimation by using a Kalman gain extractor comprising: a clock generator configured to provide frequency locked clocks to a digitally controlled phase rotator; a bang-bang phase detector (BBPD); and a Kalman gain extractor configured to estimate an optimum Kalman gain for a loop filter connected to the BBPD and the phase rotator.
 2. The CDR apparatus with adaptive optimum CDR bandwidth estimation by using a Kalman gain extractor of claim 1, wherein the Kalman gain extractor includes, an on-chip digital loop filter; and an off-chip digital processor to receive phase update information from the CDR apparatus and output the optimum Kalman gain.
 3. The CDR apparatus with adaptive optimum CDR bandwidth estimation by using a Kalman gain extractor of claim 2, wherein the on-chip digital loop filter includes a cyclic accumulator, a gain multiplier and one of a phase interpolator controller and DCO controller.
 4. The CDR apparatus with adaptive optimum CDR bandwidth estimation by using a Kalman gain extractor of claim 2, wherein the off-chip digital processor outputs the optimum Kalman gain obtained by extracting a standard deviation of step sizes of an accumulation jitter from power spectral density (PSD) of the phase update information.
 5. The CDR apparatus with adaptive optimum CDR bandwidth estimation by using a Kalman gain extractor of claim 2, wherein the off-chip digital processor includes, a storage register configured to store the phase update information; and a fast Fourier transform (FFT) processor configured to extract PSD of an absolute input jitter from the phase update information.
 6. The CDR apparatus with adaptive optimum CDR bandwidth estimation by using a Kalman gain extractor of claim 5, wherein the off-chip digital processor further includes, an optimum Kalman gain estimator configured to calculate the optimum Kalman gain from the PSD of an accumulation jitter.
 7. The CDR apparatus with adaptive optimum CDR bandwidth estimation by using a Kalman gain extractor of claim 6, wherein the off-chip digital processor further includes, a gain calibrator configured to compensate for variations in transition density.
 8. The CDR apparatus with adaptive optimum CDR bandwidth estimation by using a Kalman gain extractor of claim 1, wherein the Kalman gain extractor includes, a Kalman filter configured to find the optimum Kalman gain by minimizing a posterior MSE recursively.
 9. The CDR apparatus with adaptive optimum CDR bandwidth estimation by using a Kalman gain extractor of claim 1, wherein the BBPD includes a demultiplexer modeled by parallel BBPDs with a subsequent summation block. 